Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
3960 | 40 | 2,5 |
4138 | 38 | 1,5 |
6443 | 22 | 3,5 |
6659 | 21 | 1,2 |
7873 | 17 | 4,5 |
8199 | 16 | 0,5 |
9102 | 14 | 1,8 |
10292 | 12 | 1,3 |
11021 | 11 | 1,6 |
11047 | 11 | 2,6 |

Words containing %:

Rank in Wordlist | Frequency | Word |
---|---|---|
5006 | 30 | 20% |
5804 | 25 | 10% |
6219 | 23 | 50% |
6224 | 23 | 80% |
6671 | 21 | 40% |
6675 | 21 | 90% |
6918 | 20 | 60% |
7513 | 18 | 5% |
8216 | 16 | 25% |
8226 | 16 | 30% |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
28595 | 3 | D&M |
30631 | 3 | S&M |
37027 | 2 | B&D |
43026 | 2 | SAS&H |
58155 | 1 | AT&T |
60059 | 1 | Armour&Company |
60558 | 1 | B&Bs |
63635 | 1 | C&A |
63636 | 1 | C&T |
63727 | 1 | CD&V |

Words containing $:

Rank in Wordlist | Frequency | Word |
---|---|---|
89602 | 1 | S$500 |
96157 | 1 | US$200 |
96158 | 1 | US$25 |
96648 | 1 | VS$1 |
96649 | 1 | VS$10 |
96650 | 1 | VS$1199 |
96651 | 1 | VS$120 |
96652 | 1 | VS$130 |
96653 | 1 | VS$20 |
96654 | 1 | VS$40 |

Words containing ":

Rank in Wordlist | Frequency | Word |
---|---|---|
483 | 352 | ." |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
1966 | 92 | 1960's |
1967 | 92 | 1970's |
2091 | 85 | 1980's |
2253 | 79 | foto's |
2576 | 67 | 1990's |
2850 | 60 | 1950's |
3356 | 49 | 1930's |
3656 | 44 | 1920's |
4114 | 39 | s'n |
4141 | 38 | 1940's |

Words containing +:

Rank in Wordlist | Frequency | Word |
---|---|---|
27685 | 3 | 2-6-2+2-6-2 |
55997 | 1 | 2-8-2+2-8-2 |
56894 | 1 | 4-8-2+2-8-4 |
79511 | 1 | M31VJ00443799+4129236 |
88930 | 1 | Rio+20-konferensie |
96166 | 1 | UTC+00:20 |
96167 | 1 | UTC+08:00 |
96168 | 1 | UTC+09:00 |
96169 | 1 | UTC+0:30 |
96170 | 1 | UTC+12 |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
3175 | 53 | km/h |
3990 | 40 | en/of |
9483 | 14 | km/s |
12661 | 10 | sy/haar |
16346 | 7 | https://www |
18528 | 6 | hy/sy |
18711 | 6 | m/s |
20397 | 5 | NuPower/Tuksreeks |
21485 | 5 | hom/haar |
21508 | 5 | http://esat |

In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words; others will not. If words containing forbidden characters do not have a very low frequency, there might be a problem in preprocessing.
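
The check described above can be sketched in a few lines of Python. This is only an illustration: the function name `suspicious`, the `allowed` set, and the threshold `max_freq` are assumptions and not part of this report. The sample rows are copied from the tables in this section.

```python
# Special characters examined in this subsection.
SPECIAL = set(",()%&$\"'+*=/_")

def suspicious(rows, allowed, max_freq=5):
    """Return the (rank, freq, word) rows whose word contains a special
    character that is not in `allowed` and whose frequency exceeds
    `max_freq`. Both parameters are illustrative assumptions."""
    flagged = []
    for rank, freq, word in rows:
        forbidden = (set(word) & SPECIAL) - set(allowed)
        if forbidden and freq > max_freq:
            flagged.append((rank, freq, word))
    return flagged

# Sample rows taken from the tables in this section.
rows = [
    (483, 352, '."'),
    (3175, 53, "km/h"),
    (58155, 1, "AT&T"),
    (96157, 1, "US$200"),
]

# Hypothetical choice: /, & and $ are accepted inside words.
flagged = suspicious(rows, allowed={"/", "&", "$"})
print(flagged)
```

With these assumed settings only the entry `."` is flagged, because it combines a non-allowed character with a high frequency. Which characters to allow depends on the language and has to be decided per corpus.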
Example query for words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
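
A query of the same form can be run for every character in the list. The following is a minimal sketch using Python's `sqlite3` module with a small in-memory table. The table layout `words(w_id, word, freq)` is inferred from the query above, the sample rows reuse values from the tables in this section (with `w_id` set to rank + 100), and the helper names are hypothetical. Unlike the query above, SQLite needs an explicit `ESCAPE` clause, and an `ORDER BY` is added here to make the ranking order explicit.

```python
import sqlite3

SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]

def like_pattern(ch):
    # % and _ are LIKE wildcards, so they (and the escape character
    # itself) must be escaped to be matched literally.
    if ch in ("%", "_", "\\"):
        ch = "\\" + ch
    return "%" + ch + "%"

def words_containing(conn, ch, limit=10):
    # Same shape as the query in the text: rank = w_id - 100, entries
    # with w_id <= 100 are skipped.
    sql = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(sql, (like_pattern(ch), limit)).fetchall()

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)"
    )
    # Toy data; the first row is a made-up entry below the w_id cutoff.
    rows = [
        (50, "%", 9999),
        (101, "die", 5000),
        (3275, "km/h", 53),
        (4060, "2,5", 40),
        (5106, "20%", 30),
        (58255, "AT&T", 1),
    ]
    conn.executemany("INSERT INTO words (w_id, word, freq) VALUES (?, ?, ?)", rows)
    return {ch: words_containing(conn, ch) for ch in SPECIAL_CHARS}

result = demo()
for ch, hits in result.items():
    print(ch, hits)
```

Characters without any match, such as `(` or `_` in this toy data, simply yield an empty list.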
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots